523 research outputs found

    The Most Influential Paper Gerard Salton Never Wrote

    Gerard Salton is often credited with developing the vector space model (VSM) for information retrieval (IR). Citations to Salton give the impression that the VSM must have been articulated as an IR model sometime between 1970 and 1975. However, the VSM as it is understood today evolved over a longer time period than is usually acknowledged, and an articulation of the model and its assumptions did not appear in print until several years after those assumptions had been criticized and alternative models proposed. An often cited overview paper titled "A Vector Space Model for Information Retrieval" (alleged to have been published in 1975) does not exist, and citations to it represent a confusion of two 1975 articles, neither of which was an overview of the VSM as a model of information retrieval. Until the late 1970s, Salton did not present vector spaces as models of IR generally but rather as models of specific computations. Citations to the phantom paper reflect an apparently widely held misconception that the operational features and explanatory devices now associated with the VSM must have been introduced at the same time it was first proposed as an IR model.

    The Search for Structure and the Search for Meaning

    Statistical approaches to classification emphasize apprehension of structure by an analyst in a group of records, but issues of meaning and semantics are important, despite the focus on structure and algorithms. If meaning and semantics guide formal approaches to classification, can an understanding of structure in a collection of records inform the development of a semantic classification scheme? Data visualization tools can help human analysts recognize structure and pattern in text and numeric data.

    Internal Cohesion and External Separation

    A 1962 ASLIB Symposium titled "Classification: an Interdisciplinary Problem" led to the foundation of a new society intended to "promote co-operation and interchange of views between those interested in the principles and practice of classification in a wide range of disciplines." A great deal of valuable classification research has been conducted during the 50 years since that symposium, but in 2012 classification research communities are isolated from each other, despite the interdisciplinary connections that were recognized in 1962. In the absence of dialogue across different research traditions, we miss opportunities for progress on some foundational research questions.

    The American Tradition in Foreign Policy, by Frank Tannenbaum


    Sustaining Collection Value: Managing Collection/Item Metadata Relationships

    Many aspects of managing collection/item metadata relationships are critical to sustaining collection value over time. Metadata at the collection level not only provides context for finding, understanding, and using the items in the collection, but is often essential to the particular research and scholarly activities the collection is designed to support. Contemporary retrieval systems, which search across collections, usually ignore collection-level metadata. Alternative approaches, informed by collection-level information, will require an understanding of the various kinds of relationships that can obtain between collection-level and item-level metadata. This paper outlines the problem and describes a project that is developing a logic-based framework for classifying collection-level/item-level metadata relationships. This framework will support (i) metadata specification developers defining metadata elements, (ii) metadata librarians describing objects, and (iii) system designers implementing systems that help users take advantage of collection-level metadata. Funding: Institute for Museum and Library Services (Grant #LG06070020).

    Computer-aided Interactive Classification: Applications of VIBE

    Tools like the VIBE visualization system permit human analysts to use both an understanding of a data set's content and a recognition of structure that the visualization reveals. But what happens when a database's semantics are hidden from the analyst? What guidelines or heuristics can he or she use to reveal the "correct" underlying structure? Results of two experiments conducted at the University of Pittsburgh support the claim that VIBE analysts can uncover a meaningful clustering even without semantic clues. In one experiment, artificial data sets were created in which some of the variables discriminate one or more clusters and the other half contribute only random noise. Variable selection guidelines based on computed discrimination value were used in an attempt to distinguish between the signal and noise variables. In a second experiment, a human analyst's encoding of 714 short phrases to 23 overlapping and inter-related categories was stripped of meaningful titles and relabeled with integers. A VIBE analyst was able to highlight relationships among the 23 categories solely on the basis of co-assignment of the phrases.